---
title: "Media | Public Sentiment Exhibits"
output: html_document
#editor_options: 
#  chunk_output_type: console
editor_options: 
  chunk_output_type: console
---

This file replicates most of the findings in Ploger et al., "Time-Varying Relationships Between the ‘Sentiment’ of U.S. Public Opinion and News Media." Note that we do not provide code here to reproduce Figure 1, Table 1, or the ANOVA for the regression model of sentiment on date and house. Those analyses rely on proprietary data from Gallup which cannot be shared publicly. We also cannot provide the raw media content used to capture sentiment, due to copyright restrictions. We do however provide the derived time series alongside this replication file. Additionally, we provide all of the raw polls that were not shared directly by Gallup ("plogeretal_ijpor_rawpolls_data_nogallup.csv") and the script used to clean and process those data into the measure used here ("plogeretal_ijpor_rawpolls_processing.Rmd").

Note that we store data in one data frame, "sentiment.Ploger.Rdata. The variables in this data frame are as follows:

time:	time (count variable)
year:	year
month:	month
media:	Net Media Sentiment 
public:	Net Public Sentiment
pos.all:	positive words, All outlets
pos.abc:	positive words, ABC
pos.cbs:	positive words, CBS
pos.nbc:	positive words, NBC
neg.all:	negative words, All outlets
neg.abc:	negative words, ABC
neg.cbs:	negative words, CBS
neg.nbc:	negative words, NBC
lsd.all:	LSD net sentiment, All outlets
lsd.abc:	LSD net sentiment, All outlets
lsd.cbs:	LSD net sentiment, All outlets
lsd.nbc:	LSD net sentiment, All outlets

Note also that this file relies on the rStata package to estimate time series models in STATA and import them to R. The Stata version may have to be changes depending on what is running on your computer; and the analyses can of course be replicated entirely R, albeit with a little more code to produce Prais-Winsten estimates in R.


### Prep

```{r libraries, include=F}

library(tidyr)
library(car)
library(dplyr)
library(foreign)
library(effects)
library(RStata)
options("RStata.StataPath" = '/Applications/Stata/StataMP.app/Contents/MacOS/stata-mp')
options("RStata.StataVersion" = 15)

```

```{r loading data}

load("plogeretal_ijpor_sentiment.Rdata")

```


### Media content

```{r graphing media sentiment, echo=F}

{
png("figure2_final.png", width = 10, height = 7, units="in", res=600, bg="white")
par(mar = c(4.1, 4.1, 1.1, 5.1))
plot(A$time,A$lsd.all,type="n", ann=F, axes=F, ylim=c(-.65,.35))
axis(1,lwd=1.75,col="darkgrey",cex.axis=1.1, at=c(1,61,121,181,241), labels=c(2000,2005,2010,2015,2020))
axis(2,lwd=1.75,las=1,col="darkgrey",cex.axis=1.1,pos=-3)
points(A$time,A$lsd.abc, col="gray80", cex=.7, pch=1)
points(A$time,A$lsd.cbs, col="gray50", cex=.7, pch=1)
points(A$time,A$lsd.nbc, col="gray50", cex=.7, pch=0)

lw1 <- loess(lsd.all ~ time,data=A,span=.05)
A$lw1 <- lw1$fitted
lines(A$time,A$lw1, col="black", lwd=3)

title(ylab="Media Sentiment",line=2.25,cex.lab=1.3)
legend(5,.3,legend=c("Combined Network Evening News Sentiment","ABC","CBS","NBC"),pch=c(NA,1,1,0),lty=c(1,NA,NA,NA),lwd=c(3,NA,NA,NA),col=c("black","gray80","gray50","gray50"),cex=1.05,box.lty=0,pt.cex=1.5)
invisible(dev.off())
}

```

```{r looking at the relationship between the series}

c1 <- cor(A$lsd.abc,A$lsd.cbs,use="complete.obs")
c2 <- cor(A$lsd.nbc,A$lsd.cbs,use="complete.obs")
c3 <- cor(A$lsd.nbc,A$lsd.abc,use="complete.obs")
mean(c1,c2,c3)

media1 <- A[,c("lsd.abc","lsd.cbs","lsd.nbc")]
media1 <- drop_na(media1)
f_all <- factanal(x = media1, factors = 1, scores="regression")
f_all

```

```{r anova of date and network, echo=F}

abc <- A[,c("year","month","lsd.abc")]
abc$network <- "abc"
cbs <- A[,c("year","month","lsd.cbs")]
cbs$network <- "cbs"
nbc <- A[,c("year","month","lsd.nbc")]
nbc$network <- "nbc"
colnames(abc) <- c("year","month","lsd","network")
colnames(cbs) <- c("year","month","lsd","network")
colnames(nbc) <- c("year","month","lsd","network")
med <- rbind(abc,cbs,nbc)
med$time <- as.factor(med$year+(.01*med$month))

medfac <- lm(lsd ~ time + network -1, data=med)
Anova(medfac)

25.1411/(25.1411+0.3714+3.9681)
0.3714/(25.1411+0.3714+3.9681)

```


### Public & Media

```{r correlations in levels}

cor.test(A$media,A$public)
cor.test(A$media[A$year>2009],A$public[A$year>2009], use="complete.obs")

```

```{r basic ts models }

{
stata_src <- "
        tsset time
        regr public time if year<2009
        predict RpublicT, residuals
        sum public if year>2008
        gen publicT=public+1.23
        replace RpublicT=publicT if year>2008
        egen stdRpublicT=std(RpublicT)
        quietly regress d.media l.media d.stdRpublicT l.stdRpublicT
        estimates store A
        quietly regress d.stdRpublicT l.stdRpublicT d.media l.media 
        estimates store B
        quietly regress d.media l.media d.stdRpublicT l.stdRpublicT if year>2008
        estimates store C
        quietly regress d.stdRpublicT l.stdRpublicT d.media l.media  if year>2008
        estimates store D
        esttab A B C D, cells(b(star fmt(3)) se(par fmt(3))) nogaps star(a 0.10 * 0.05 ** 0.01 *** 0.01) r2"
stata(stata_src,data.in=A)
}

```

```{r stata rolling public as dv}

stata_src <- "
 tsset time
 regr public time if year<2009
 predict RpublicT, residuals
 sum public if year>2008
 gen publicT=public+1.23
 replace RpublicT=publicT if year>2008
 egen stdRpublicT=std(RpublicT)
 rolling _b _se, window(60) clear: regr d.stdRpublicT l.stdRpublicT d.media l.media
 saveold \"stata.out1.dta\", replace version(12)
"

stata(stata_src,data.in=A,stata.echo=T)

```

```{r stata media as dv}

stata_src <- "
 tsset time
 regr public time if year<2009
 predict RpublicT, residuals
 sum public if year>2008
 gen publicT=public+1.23
 replace RpublicT=publicT if year>2008
 egen stdRpublicT=std(RpublicT)
 rolling _b _se, window(60) clear: regr d.media l.media d.stdRpublicT l.stdRpublicT
 saveold \"stata.out2.dta\", replace version(12)
"

stata(stata_src,data.in=A,stata.echo=T)

```

```{r importing rolling results}

out1 <- read.dta("stata.out1.dta") #public as dv
out2 <- read.dta("stata.out2.dta") #media as dv

colnames(out1) <- c("start", "end", "l.public", "d.media", "l.media", "cons", "l.public.se", "d.media.se", "l.media.se", "cons.se")
colnames(out2) <- c("start", "end", "l.media", "d.public", "l.public", "cons", "l.media.se", "d.public.se", "l.public.se", "cons.se")

d <- A[,c("time","year","month")]
d$ym <- d$year + (d$month*.01)
out1 <- merge(out1,d,by.x="start",by.y="time",all.x=T)
out2 <- merge(out2,d,by.x="start",by.y="time",all.x=T)

```

```{r rolling coef graphics dv ch media iv ch public, echo=F}

out2$lower <- out2$d.public - (1.9*out2$d.public.se)
out2$higher <- out2$d.public + (1.9*out2$d.public.se)
poly <- c(out2$start,rev(out2$start))
gon <- c(out2$lower,rev(out2$higher))
polyg1 <- as.data.frame(cbind(poly,gon))

out2$lower <- out2$l.public - (1.9*out2$l.public.se)
out2$higher <- out2$l.public + (1.9*out2$l.public.se)
poly <- c(out2$start,rev(out2$start))
gon <- c(out2$lower,rev(out2$higher))
polyg2 <- as.data.frame(cbind(poly,gon))

{
png("figure3_final.png", width = 10, height = 13, units="in", res=600, bg="white")

par(mfrow=c(2,1))
par(mar=c(5.1, 5.1, 1.1, 2.1))

plot(out2$start,out2$public, type="n", ylim=c(-1,1),ann=F, axes=F)
abline(h=0, col="gray", lty=2)
polygon(polyg1$poly,polyg1$gon,border=NA,col="gray",density=36,angle=90)
lines(out2$start,out2$d.public, lwd=2)
axis(1, labels=out2$ym[c(seq(1,200,by=12))], at=c(seq(1,200,by=12)))
axis(2,las=1)
mtext("Coefficient: Ch. Public(t) Effect on Ch. Media(t)",side=2, line=3.5)
#mtext("Start date (for 60-month window)",side=1, line=2.5)

plot(out2$start,out2$l.public, type="n", ylim=c(-1,1),ann=F, axes=F)
abline(h=0, col="gray", lty=2)
polygon(polyg2$poly,polyg2$gon,border=NA,col="gray",density=36,angle=90)
lines(out2$start,out2$l.public, lwd=2)
axis(1, labels=out2$ym[c(seq(1,200,by=12))], at=c(seq(1,200,by=12)))
axis(2,las=1)
mtext("Coefficient: Public(t-1) Effect on Ch. Media(t)",side=2, line=3.5)
mtext("Start date (for 60-month window)",side=1, line=2.5)

invisible(dev.off())
}

```

```{r rolling coef graphics dv ch public iv ch media, echo=F}

out1$lower <- out1$d.media - (1.9*out1$d.media.se)
out1$higher <- out1$d.media + (1.9*out1$d.media.se)
poly <- c(out1$start,rev(out1$start))
gon <- c(out1$lower,rev(out1$higher))
polyg1 <- as.data.frame(cbind(poly,gon))

out1$lower <- out1$l.media - (1.9*out1$l.media.se)
out1$higher <- out1$l.media + (1.9*out1$l.media.se)
poly <- c(out1$start,rev(out1$start))
gon <- c(out1$lower,rev(out1$higher))
polyg2 <- as.data.frame(cbind(poly,gon))

{
png("figure4_final.png", width = 10, height = 13, units="in", res=600, bg="white")

par(mfrow=c(2,1))
par(mar=c(5.1, 5.1, 1.1, 2.1))

plot(out1$start,out1$d.media, type="n", ylim=c(-1,1),ann=F, axes=F)
abline(h=0, col="gray", lty=2)
polygon(polyg1$poly,polyg1$gon,border=NA,col="gray",density=36,angle=90)
lines(out1$start,out1$d.media, lwd=2)
axis(1, labels=out1$ym[c(seq(1,200,by=12))], at=c(seq(1,200,by=12)))
axis(2,las=1)
mtext("Coefficient: Ch. Media(t) Effect on Ch. Public(t)",side=2, line=3.5)
#mtext("Start date (for 60-month window)",side=1, line=2.5)

plot(out1$start,out1$l.public, type="n", ylim=c(-1,1),ann=F, axes=F)
abline(h=0, col="gray", lty=2)
polygon(polyg2$poly,polyg2$gon,border=NA,col="gray",density=36,angle=90)
lines(out1$start,out1$l.media, lwd=2)
axis(1, labels=out1$ym[c(seq(1,200,by=12))], at=c(seq(1,200,by=12)))
axis(2,las=1)
mtext("Coefficient: Media(t-1) Effect on Ch. Public(t)",side=2, line=3.5)
mtext("Start date (for 60-month window)",side=1, line=2.5)

invisible(dev.off())
}

```

### Appendix

We do not provide code here to reproduce the content analysis of economic stories because they rely on proprietary data from Lexis Uni which cannot be shared publicly. However, the dictionaries used to search the media data are included below.

```{r econ dictionaries}

library(quanteda)
econ.dict <- dictionary(list(
macro = c("aggregate demand", "aggregate supply", "business cycle", "demand shock", "demand side", "demand-side", "econom*", "food price*", "industr*", "keynes*", "bear market", "bretton woods", "bull market", "coinage", "distribution of income", "central bank", "federal reserve", "fiscal", "gini coefficien*", "income distribution", "income inequalit", "interest rate", "monetar", "money supply", "price index", "price indices", "price level", "recession", "macroeconom", "microeconom", "supply and demand", "supply shock", "supply side", "supply-side", "cost of living", "supply of money", "labour market", "standard of living", "living standard", "gross national product", "gross domestic product", "gdp", "gnp", "gold standard") ,
unemp = c("unemploy*", "jobless*", "employm*")  ,
inf = c("inflation*", "cost of living", "deflation*", "hyperinflation*"),
stock = c("stocks")
))

```

